A Structured Document Preparation System: AutoLayouter Version 2.0, an Enhancement for Handling Multiple Document Types
Abstract
AutoLayouter is a structured document preparation system used to increase efficiency in creating and reusing designed documents in offices. AutoLayouter consists of an easy-to-use structured editor and a Japanese LaTeX-based formatter. With a structured editor, the user need not be concerned with page layout and can concentrate on creating the contents of the document. Because these documents are structured logically, they can be easily reused or processed further by other systems. At the 1990 TUG meeting, we presented AutoLayouter version 1.0. Since then we have been improving the system to handle more complicated document structures, such as those defined in SGML. In this paper, we describe 1) new document structures, and 2) ALTeX, which directly formats structured documents.
TUGboat, Volume 12 (1991), No. 3, Proceedings of the 1991 Annual Meeting, p. 422
Introduction
Recent research projects on document processing have been directed at structured document representations, such as SGML. The basic idea of a structured document is to separate a document into structure and content; its contents are then extracted in terms of its structure. In an SGML document, the structure is defined explicitly as a DTD (Document Type Definition), so that documents created with the same DTD are interchangeable. Such a structure can also be used by a document processing system to retrieve the required information: for instance, the title, author, and date of technical reports can be retrieved through their structure and merged into a summary table.
The structured document representation, especially the logically structured one, is essential to making the best use of electronic documents. We can store documents in electronic format, and load and print them on paper, using conventional word processors or desktop publishing systems. These documents cannot be processed by other systems, however, unless the logical meanings of their contents are preserved, because there is no other way to identify the contents.
Because of its abstract, declarative language, LaTeX is often referred to as an example of a text formatter for logically structured documents. LaTeX is used as a document preparation tool by computer software engineers because they can use any editor and can concentrate on a document's content and structure without paying any attention to its physical appearance.
In Japan, the advance of word processing technology has meant that business documents are prepared and stored electronically, but they must also be kept in printed form. The format of most Japanese business documents separates items with ruled lines. This standardizes the items to be written and determines the text area available for each item. Japanese word processors have some features for editing these forms: they draw ruled lines and insert text in the area surrounded by the rules. However, this augmentation of rule-line functions has made it too complex to manage document files and to reuse document contents. As a result, a document must still be managed in printed form, even though it is stored in an electronic format.
To solve these problems, we have developed a structured document preparation system, AutoLayouter, whose objective is to increase efficiency in creating and reusing preformed documents. AutoLayouter consists of a structured editor for creating SGML-like documents, and a Japanese LaTeX-based formatter called ALTeX. In the subsequent sections of this paper, we mainly describe the document structures of AutoLayouter and implementation issues of the ALTeX formatter.
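The retrieval scenario mentioned in the introduction, in which the title, author, and date of technical reports are pulled out through their structure and merged into a summary table, can be illustrated with a minimal sketch. It is not AutoLayouter's implementation: it assumes XML as a stand-in for SGML, and the element names (report, title, author, date, body) and the helper functions are hypothetical, chosen only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical reports. The element names are illustrative assumptions and
# are not taken from any AutoLayouter DTD; XML stands in for SGML here.
REPORTS = [
    "<report><title>Formatter Design</title><author>A. Author</author>"
    "<date>1991-01-15</date><body>...</body></report>",
    "<report><title>Editor Interface</title><author>B. Author</author>"
    "<date>1991-03-02</date><body>...</body></report>",
]

FIELDS = ("title", "author", "date")


def extract_fields(source):
    """Return the title, author, and date of one structured report."""
    root = ET.fromstring(source)
    # findtext returns None when an element is missing; use "" instead.
    return {name: (root.findtext(name) or "").strip() for name in FIELDS}


def summary_table(sources):
    """Merge the extracted fields of several reports into a text table."""
    rows = [extract_fields(source) for source in sources]
    # Each column is as wide as its longest entry (or its header).
    widths = {
        name: max([len(name)] + [len(row[name]) for row in rows])
        for name in FIELDS
    }
    lines = [" | ".join(name.ljust(widths[name]) for name in FIELDS)]
    for row in rows:
        lines.append(" | ".join(row[name].ljust(widths[name]) for name in FIELDS))
    return "\n".join(lines)


if __name__ == "__main__":
    print(summary_table(REPORTS))
```

Because the fields are located by their logical names rather than by their position on the page, the same extraction works for any report that follows the same structure, which is the point the introduction makes about documents sharing a DTD.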
Publication date: 2011